AITopics | in-context accuracy

in-context accuracy

completion

Neural Information Processing Systems | Apr-27-2026, 23:23:48 GMT

Algorithm 2 describes the prompt completion algorithm introduced in Section 2.2. It implicitly considers a single action, which takes the next sequence element.

Algorithm 2 - Prompt completion
Input: Grounded schema $\{T, C, E_{rb}\}$ with rebound CSCG emission matrix $E_{rb}$, delimiter token $x$, prompt $x^{(\text{prompt})} = (x_1, \ldots, x_m)$
Output: A completed prompt $x^{(\text{prompt completed})} = (x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+p} = x)$
1: Run max-product for MAP inference and return $z^{\text{MAP}} = (z_1, \ldots, z_m) = \arg\max_z p(z \mid x^{(\text{prompt})})$.

Algorithm 3 is a variant of the rebinding Algorithm 1 that does not use EM. Instead, it first searches for "surprising observations": a surprise is an observation that has a low probability of being emitted by its decoded clone.
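The excerpt above names two building blocks: max-product MAP inference over the hidden clones, and a search for observations that their decoded clone is unlikely to emit. The following is a minimal sketch of both on a generic discrete HMM, not the paper's implementation. The function names `viterbi` and `surprising_positions`, the initial distribution `pi`, the toy matrices, and the `threshold` value are all assumptions made for illustration; the excerpt does not specify how a "low probability" is defined.

```python
import math


def viterbi(T, E, pi, obs):
    """Max-product (Viterbi) MAP decoding for a discrete HMM.

    T[i][j] = P(z_next = j | z = i), E[i][k] = P(x = k | z = i),
    pi[i] = P(z_1 = i). Returns the most probable hidden-state sequence.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(pi)
    # score[j]: best log-probability of any path that ends in state j.
    score = [log(pi[j]) + log(E[j][obs[0]]) for j in range(n_states)]
    back = []  # back[n][j]: best predecessor of state j at step n + 1
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log(T[i][j]))
            pointers.append(best_i)
            score.append(prev[best_i] + log(T[best_i][j]) + log(E[j][x]))
        back.append(pointers)
    # Backtrack from the best final state.
    z = [max(range(n_states), key=lambda j: score[j])]
    for pointers in reversed(back):
        z.append(pointers[z[-1]])
    return z[::-1]


def surprising_positions(E, z_map, obs, threshold=0.1):
    """Indices whose observation is unlikely under its decoded state.

    The threshold is a hypothetical choice, not taken from the source.
    """
    return [n for n, (z, x) in enumerate(zip(z_map, obs)) if E[z][x] < threshold]


# Toy model (assumed): three states visited in a fixed cycle 0 -> 1 -> 2 -> 0,
# each state emitting "its own" symbol with probability 0.9.
T = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
E = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
pi = [1 / 3, 1 / 3, 1 / 3]

obs = [0, 1, 0, 0]  # the third symbol breaks the learned cycle
z_map = viterbi(T, E, pi, obs)
print(z_map)                                 # [0, 1, 2, 0]
print(surprising_positions(E, z_map, obs))   # [2]
```

In this toy run the decoded path still follows the cycle, so position 2 is flagged: state 2 emits symbol 0 with probability 0.05 only. In the rebinding setting described above, such flagged positions would be the candidates for binding a novel token to an existing slot; how the paper performs that update is not shown in the excerpt.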

artificial intelligence, in-context accuracy, matrix, (15 more...)

Technology: Information Technology > Artificial Intelligence (0.72)

Schema-learning and rebinding as mechanisms of in-context learning and emergence

Neural Information Processing Systems | Apr-27-2026, 23:23:45 GMT

In-context learning (ICL) is one of the most powerful and most unexpected capabilities to emerge in recent transformer-based large language models (LLMs). Yet the mechanisms that underlie it are poorly understood. In this paper, we demonstrate that comparable ICL capabilities can be acquired by an alternative sequence prediction learning method, namely clone-structured causal graphs (CSCGs). A key property of CSCGs is that, unlike transformer-based LLMs, they are interpretable, which considerably simplifies the task of explaining how ICL works. We show that ICL in CSCG uses a combination of (a) learning template (schema) circuits for pattern completion, (b) retrieving relevant templates in a context-sensitive manner, and (c) rebinding novel tokens to appropriate slots in the templates. We go on to marshall evidence for the hypothesis that similar mechanisms underlie ICL in LLMs. For example, we find that, with CSCGs as with LLMs, different capabilities emerge at different levels of overparameterization, suggesting that overparameterization helps in learning more complex template (schema) circuits. By showing how ICL can be achieved with small models and datasets, we open up a path to novel architectures, and take a vital step towards a more general understanding of the mechanics behind this important capability.

large language model, machine learning, natural language, (19 more...)

Country: Europe > Austria (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

5bc3356e0fa1753fff7e8d6628e71b22-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 07:42:40 GMT

artificial intelligence, in-context accuracy, matrix, (18 more...)

Technology: Information Technology > Artificial Intelligence (0.71)

Schema-learning and rebinding as mechanisms of in-context learning and emergence

Neural Information Processing Systems | Feb-12-2026, 07:42:36 GMT

In-context learning (ICL) is one of the most powerful and most unexpected capabilities to emerge in recent transformer-based large language models (LLMs).

large language model, machine learning, natural language, (18 more...)

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Schema-learning and rebinding as mechanisms of in-context learning and emergence

Swaminathan, Sivaramakrishnan, Dedieu, Antoine, Raju, Rajkumar Vasudeva, Shanahan, Murray, Lazaro-Gredilla, Miguel, George, Dileep

arXiv.org Artificial Intelligence | Jun-15-2023

In-context learning (ICL) is one of the most powerful and most unexpected capabilities to emerge in recent transformer-based large language models (LLMs). Yet the mechanisms that underlie it are poorly understood. In this paper, we demonstrate that comparable ICL capabilities can be acquired by an alternative sequence prediction learning method using clone-structured causal graphs (CSCGs). Moreover, a key property of CSCGs is that, unlike transformer-based LLMs, they are interpretable, which considerably simplifies the task of explaining how ICL works. Specifically, we show that it uses a combination of (a) learning template (schema) circuits for pattern completion, (b) retrieving relevant templates in a context-sensitive manner, and (c) rebinding of novel tokens to appropriate slots in the templates. We go on to marshall evidence for the hypothesis that similar mechanisms underlie ICL in LLMs. For example, we find that, with CSCGs as with LLMs, different capabilities emerge at different levels of overparameterization, suggesting that overparameterization helps in learning more complex template (schema) circuits. By showing how ICL can be achieved with small models and datasets, we open up a path to novel architectures, and take a vital step towards a more general understanding of the mechanics behind this important capability.

large language model, machine learning, natural language, (18 more...)

2307.01201

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
